Abstract

Backtracking is a basic strategy for solving constraint satisfaction problems (CSPs). A
satisfiable CSP instance is backtrack-free if a solution can be found without encountering
any dead end during a backtracking search, which implies that the instance is easy to solve.
We prove an exact phase transition of backtrack-free search in some random CSPs,
namely in Model RB and in Model RD. This is the first time an exact phase transition
of backtrack-free search has been identified on random CSPs. Our technical results
also have interesting implications for the power of greedy algorithms, for the width
of superlinear dense random hypergraphs and for the exact satisfiability threshold of
random CSPs.
1 Introduction



In constraint satisfaction problems (CSPs), values are assigned to variables to fulfil con- 
straints among these variables [HI HSj. Backtracking is a basic strategy to solve CSPs 
[151 [lllllH Ul [161 132] . A CSP instance is called backtrack-free, if we can always extend from 
scratch a partial assignment to a solution without any reassignment (or backtracking) along 
a linear ordering on variables, and at each variable we only need to keep the extended partial 
assignment compatible with these constraints among assigned variables, implying that the 
instance is easy to solve [17] . In practice, backtrack-freeness is a very desirable property 
in many applications [26l [H [25l |45l [6] . In theory, sufficient conditions and on random 
instances for backtrack-freeness have been studied [T7l[T0l[i6l[T2llT7l[i0l[26l[33llM[T3]. 
Here, we study backtrack-freeness from a theoretical point of view along these two lines.

The sufficient conditions for backtrack-freeness on CSPs were given by Freuder in terms of
strong consistency and the width of the constraint graph [17, 18, 19], by van Beek and Dechter
in terms of local and global consistency [10, 46] and constraint tightness and looseness [47], by
Dakic et al. in terms of overlap of cliques in the interval graph representation [12], by Jackson
et al. in terms of k-consistency and overlap in constraint graphs [26], by Pang and Goodwin in
terms of w-consistency and the tree-structured w-graph associated with constraint hypergraphs
[40], and by Kolaitis and Vardi in terms of k-locality [33]. Here, yet another sufficient
condition is given, in terms of what we call vertex-centered consistency and the width of the
constraint hypergraph.

A non-zero probability of backtrack-freeness on random instances for a range of parameter 
values was used by Smith to lower bound the satisfiability threshold [44]. Dyer, Frieze 
and Molloy obtained a threshold for backtrack-freeness with respect to the parameter of 
the domain size of binary CSPs with a linear number of constraints [13]. Here we identify 
an exact threshold of backtrack-freeness with respect to the density parameter for non-binary
CSPs with a superlinear number of constraints. This is the first time an exact phase
transition of backtrack-freeness has been identified on random CSPs. Previously, exact phase
transition results for algorithmic behavior were rare and mainly concerned resolution [1, 36].

Our proofs work by first showing a phase transition result about variable-centered consistency
and then estimating the width of a random hypergraph by determining the existence
of specific k-cores. As far as we know, this is the first k-core result on k-uniform hypergraphs
with $rn\ln n$ hyperedges and $n$ vertices. In our case, the width increases smoothly with the
density parameter, in sharp contrast to the earlier k-core threshold results in the literature for
sparse hypergraphs [ilMlSIllIllHllMllllllMlllTllMllSlllllin]. 

Our results have implications on the power of greedy algorithms, since below the backtrack- 
freeness threshold we can find a solution in a greedy manner for almost all instances, while 
above the threshold we are forced to search with backtracking for almost all instances, even 
for satisfiable instances. To this end, we define the width of greedy algorithms. Also, our 
results show that for Model RB/RD, the satisfiability threshold and some local property 
threshold are linked tightly, so we suggest that a similar link might exist for random 3-SAT. 

This paper is organized as follows. In Section 2 we fix our notation and give all necessary
definitions and some known results. In Section 3 we show the exact phase transition
of backtrack-freeness. In Section 4 we show results about width and k-cores in random
hypergraphs. In Section 5 we discuss some implications of our results.

2 Preliminaries 

In constraint satisfaction problems (CSPs), a set of variables $\{u_1, u_2, \ldots, u_n\}$ and a set of
constraints $\{C_1, C_2, \ldots, C_m\}$ are given for each instance. We call $n$ the input size and the ratio
$m/n$ the constraint density. Each variable can take a value from a finite domain $\{1, 2, \ldots, d\}$.
We allow $d$ to increase with $n$, say $d = n^{\alpha}$, where $\alpha$ is a constant. An assignment is a
mapping from the variable set to the domain and a partial assignment is a mapping from
a variable subset to the domain. Each constraint involves a subset of variables and labels
each partial assignment on these variables either as compatible or incompatible, but not
both. In so-called $k$-CSPs, each constraint involves $k$ variables. 2-CSPs are also called
binary CSPs. An assignment compatible with all constraints is called a solution. Instances
with at least one solution are called satisfiable, otherwise unsatisfiable.

In random CSPs, constraints are generated by a random process with a small number of
control parameters, leading to a probability distribution on all instances. In Model RB,
given $n$ variables each with domain $\{1, 2, \ldots, d\}$, where $d = n^{\alpha}$ and $\alpha > 0$ is constant, select
with repetition $m = rn\ln n$ random constraints; for each constraint select without repetition
$k$ of the $n$ variables, where $k = 2, 3, 4, \ldots$, and select uniformly at random without repetition
$(1-p)d^k$ compatible assignments for these $k$ variables, where $0 < p < 1$ is constant. If in
the last step above each assignment for the $k$ variables is instead selected as compatible
independently with probability $1-p$, then the model is called Model RD [48]. Model RB is
asymptotically similar to Model RD just as $G(n, M)$ is to $G(n, p)$, so all asymptotic results should
hold for both Model RB and Model RD [48, 39]. For simplicity, here we only give proofs valid for
Model RD and omit the more complicated calculations for Model RB. For Model RB/RD, not only can
exact satisfiability thresholds be identified [48], but the existence of many hard instances
around the thresholds can also be demonstrated both theoretically and experimentally [50].
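
The generation process of Model RD can be sketched in a few lines of Python. The sketch below is illustrative only and is not taken from the cited works: the function name `generate_model_rd`, the 0-based variable indices, and the rounding of $d = n^{\alpha}$ and $m = rn\ln n$ to integers are our assumptions.

```python
import itertools
import math
import random


def generate_model_rd(n, k, alpha, r, p, seed=None):
    """Sample one Model RD instance (hypothetical helper for illustration).

    Returns (d, constraints). Each constraint is a pair (scope, allowed):
    scope is a tuple of k distinct variable indices in 0..n-1 and allowed
    is the set of compatible value tuples over the domain {1, ..., d}.
    """
    rng = random.Random(seed)
    d = max(1, round(n ** alpha))        # domain size d = n^alpha (rounded)
    m = round(r * n * math.log(n))       # m = r n ln n constraints (rounded)
    constraints = []
    for _ in range(m):                   # scopes are drawn with repetition
        scope = tuple(rng.sample(range(n), k))   # k distinct variables
        allowed = {
            values
            for values in itertools.product(range(1, d + 1), repeat=k)
            if rng.random() < 1 - p      # compatible with probability 1 - p
        }
        constraints.append((scope, allowed))
    return d, constraints
```

Enumerating all $d^k$ tuples per constraint is only practical for small $n$ and $k$; for Model RB one would instead draw exactly $(1-p)d^k$ compatible tuples without repetition.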

Theorem 2.1. ([48], Theorem 1) Let $r_{cr} = -\frac{\alpha}{\ln(1-p)}$, where $\alpha > \frac{1}{k}$, $0 < p < 1$ are constants
and $k \ge \frac{1}{1-p}$. Then for a random instance $\phi$ in Model RB/RD,

$$\lim_{n\to\infty} \Pr(\phi \text{ is satisfiable}) = \begin{cases} 1 & r < r_{cr}, \\ 0 & r > r_{cr}. \end{cases}$$

Theorem 2.2. ([49], Theorem 3) Almost all instances in Model RB/RD have no tree-like
resolutions of length less than $2^{\Omega(n)}$ and no general resolutions of length less than $2^{\Omega(n/d)}$.

In graph theory, a hypergraph consists of some nodes and some hyperedges. Each hyperedge
is a subset of nodes. A hypergraph is $k$-uniform if every hyperedge contains exactly $k$ nodes.
Every CSP has an underlying constraint (multi-)hypergraph: each variable corresponds to
a node and each constraint corresponds to a hyperedge in a natural way. The constraint
hypergraphs of random CSPs are random hypergraphs [29]. The constraint hypergraph of
Model RB/RD, denoted by $HG(n, rn\ln n, k)$, is a random $k$-uniform multi-hypergraph with
$n$ nodes and $rn\ln n$ hyperedges, where $r$ is constant and $k = 2, 3, 4, \ldots$. Denote by $HG$ a
random hypergraph from $HG(n, rn\ln n, k)$.

Let $\phi$ be an instance of CSPs. Let $u$ be a variable. Let $C$ be a constraint involving $u$, where
$C$ is called a $u$-constraint. For any $u$, the total number of $u$-constraints is called the degree
of $u$ and denoted by $deg(u)$. Let $C_u$ be a set of $u$-constraints, where $C_u$ is called $u$-centered.
Denote by $N_{C_u}$ the set of all variables involved in constraints in $C_u$. Denote by $C_{\setminus u}$ the
set of all constraints among variables in $N_{C_u} \setminus \{u\}$. Denote by $T_{C_{\setminus u}}$ the set of all partial
assignments each compatible with all constraints in $C_{\setminus u}$. Let $c$ be a partial assignment in
$T_{C_{\setminus u}}$. Let $v$ be a value for $u$. Denote by $c'$ the partial assignment extending $c$ just with
$u = v$.

Let $\pi$ be a linear ordering on variables in $\phi$, say $u_1 < u_2 < \cdots < u_n$. Denote by $C^{\pi}_{u_i}$ the set
of all $u_i$-constraints such that all constraints in $C^{\pi}_{u_i}$ are among $\{u_1, u_2, \ldots, u_i\}$. The width
of $u_i$ under $\pi$ is just $|C^{\pi}_{u_i}|$. The width of $\pi$ is $\max_i width(u_i)$, denoted by $width(\pi)$. The
width of $\phi$ is $\min_{\pi} width(\pi)$, denoted by $width(\phi)$. For constraint hypergraphs, the degree
and width can be defined in a similar way. The width and the associated optimal linear
ordering can be found efficiently [17, 18, 19, 38]. Moreover, the linkage of a hypergraph $HG$
is the minimum degree of all its nodes, denoted by $linkage(HG)$. A $k$-core of a hypergraph
is a nonempty maximal subgraph with minimum degree at least $k$. In [17], it was essentially
proved that the width of a hypergraph is equal to the maximal linkage of its subgraphs.
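
The characterization of width as the maximal linkage over subgraphs suggests a simple way to compute it: repeatedly remove a node of minimum remaining degree, together with its hyperedges. The sketch below follows this idea under our own naming (`hypergraph_width` is hypothetical); it is one possible procedure, not necessarily the algorithm of the cited works.

```python
def hypergraph_width(n, hyperedges):
    """Return (width, ordering) for nodes 0..n-1 and a list of hyperedges.

    Nodes are removed in order of minimum remaining degree. The reverse of
    the removal order is the linear ordering u_1 < ... < u_n, and the width
    is the largest degree observed at removal time.
    """
    edges = [set(e) for e in hyperedges]
    incident = {u: set() for u in range(n)}      # node -> indices of live hyperedges
    for idx, e in enumerate(edges):
        for u in e:
            incident[u].add(idx)
    remaining = set(range(n))
    removal, width = [], 0
    while remaining:
        # Pick a node of minimum remaining degree (ties broken by index).
        u = min(remaining, key=lambda x: (len(incident[x]), x))
        width = max(width, len(incident[u]))
        # Deleting u deletes every hyperedge that contains it.
        for idx in list(incident[u]):
            for w in edges[idx]:
                incident[w].discard(idx)
        remaining.remove(u)
        removal.append(u)
    return width, removal[::-1]
```

For example, a triangle has width 2 and a path has width 1. Repeated hyperedges are counted separately, as in a multi-hypergraph.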

Consider the following strategy to solve $\phi$. At step 1, we assign an arbitrary value to $u_1$.
Assume that after step $i-1$, we have a partial assignment $c$ on $\{u_1, u_2, \ldots, u_{i-1}\}$ which is
compatible with all constraints among $\{u_1, u_2, \ldots, u_{i-1}\}$. At step $i$, we find a value $v$ for
$u_i$ such that, when $c$ is extended with $u_i = v$, the resulting assignment $c'$ is compatible with
all constraints among $\{u_1, u_2, \ldots, u_i\}$. Such a $v$ is called available. When there is more
than one available $v$, we take an arbitrary one. Note that the only requirement
on $v$ is that, when $c$ is extended with $u_i = v$, the resulting assignment $c'$ is compatible
with all constraints among $\{u_1, u_2, \ldots, u_i\}$. In fact, since $c$ is already compatible with all
constraints among $\{u_1, u_2, \ldots, u_{i-1}\}$, the only requirement on $v$ is that $c'$
is compatible with all constraints in $C^{\pi}_{u_i}$. If at each step $i$ ($1 \le i \le n$), for every partial
assignment $c$, we can always find such a value $v$ for $u_i$, then we say that $\phi$ is backtrack-free
under $\pi$. Otherwise, we say that $\phi$ is not backtrack-free under $\pi$. If there is a $\pi$ such that
$\phi$ is backtrack-free under $\pi$, then we say that $\phi$ is backtrack-free.

If, whenever $|C_u| \le t$, for every $c \in T_{C_{\setminus u}}$ we can always find a $v$ such that $c'$ is
compatible with all constraints in $C_u$ (that is, for all $C \in C_u$, $c'$ is compatible with $C$),
then we say that $u$ is variable-centered $t$-consistent. If every $u$ in an instance is
variable-centered $t$-consistent, then we call this instance variable-centered $t$-consistent, and $t$ is
called the critical size of this variable-centered consistency.

Denote by $E(X)$ the expectation of a random variable $X$, by $B(n, p)$ the binomial distribution, and
by $\Pr(\mathcal{E})$ the probability of an event $\mathcal{E}$. An event $\mathcal{E}$ occurs with high probability, or whp, if
$\lim_{n\to\infty} \Pr(\mathcal{E}) = 1$.

Lemma 2.3. (Chernoff Bound [37, 23]) For a random variable $X$ with distribution
$B(n, \frac{\mu}{n})$ and $0 < \epsilon < 1$, we have $\Pr(X < (1-\epsilon)\mu) < e^{-\mu\epsilon^2/2}$ and $\Pr(X > (1+\epsilon)\mu) < e^{-\mu\epsilon^2/3}$,
and for any $\mu_h > \mu$, $\Pr(X > (1+\epsilon)\mu_h) < e^{-\mu_h\epsilon^2/3}$.
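
As a numerical sanity check, the standard multiplicative Chernoff bounds for a binomial variable with mean $\mu$ (exponent $\mu\epsilon^2/2$ for the lower tail and $\mu\epsilon^2/3$ for the upper tail) can be compared with exact tail probabilities. The helper names below are ours and the parameter values in the usage note are arbitrary.

```python
import math


def binom_pmf(n, p, i):
    """Pr(X = i) for X ~ B(n, p)."""
    return math.comb(n, i) * p**i * (1 - p) ** (n - i)


def lower_tail(n, p, x):
    """Pr(X < x)."""
    return sum(binom_pmf(n, p, i) for i in range(0, math.ceil(x)))


def upper_tail(n, p, x):
    """Pr(X > x)."""
    return sum(binom_pmf(n, p, i) for i in range(math.floor(x) + 1, n + 1))


def chernoff_check(n, mu, eps):
    """Return (lower tail, lower bound, upper tail, upper bound) for B(n, mu/n)."""
    p = mu / n
    return (
        lower_tail(n, p, (1 - eps) * mu), math.exp(-mu * eps**2 / 2),
        upper_tail(n, p, (1 + eps) * mu), math.exp(-mu * eps**2 / 3),
    )
```

Calling `chernoff_check(200, 20, 0.5)` returns the two exact tails next to their bounds, and each tail lies below its bound.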

Finally, $f \ll g$ means $f = o(g)$, that is, $\lim_{n\to\infty} f/g = 0$. A useful inequality is $1 - x < e^{-x} <
1 - x + o(x)$ for small $x > 0$.



3 The exact threshold of backtrack-freeness 

In this section we give the exact threshold of backtrack-freeness for the Model RB and 
Model RD. We first give a sufficient condition for backtrack-freeness. 

Note: In this section, when we use $\phi$, $\pi$, $C^{\pi}_{u_i}$, $u$, $C_u$, $N_{C_u}$, $C_{\setminus u}$, $T_{C_{\setminus u}}$, $c$, $v$, $c'$ and $C$, we
implicitly assume that they adhere to the descriptions in Section 2.

Theorem 3.1. If $\phi$ is vertex-centered $width(\phi)$-consistent, then $\phi$ is backtrack-free.

Proof. By definition of backtrack-freeness, clearly

$$\phi \text{ is backtrack-free} \iff \exists \pi, \forall u, \forall c \in T_{C^{\pi}_{\setminus u}}, \exists v, \forall C \in C^{\pi}_{u}, \ c' \text{ is compatible with } C.$$

By definition of width, there is a $\pi$ such that $width(\phi) = width(\pi)$. Under $\pi$, for all
$u_i$, $width(u_i) \le width(\pi) = width(\phi)$. Then the vertex-centered $width(\phi)$-consistency
guarantees that at each $u_i$, the partial assignment can be extended as desired by backtrack-free
search. □

As a warm up, we upper bound the number of $u$-constraints for any $u$ by $O(\ln n)$.

Lemma 3.2. $\max_u deg(u) \le (1 + \sqrt{6/(kr)})\,kr\ln n$ whp.

Proof. Since the total number of constraints is $rn\ln n$, every constraint involves exactly $k$
vertices, and a given vertex appears in a constraint with probability $\frac{k}{n}$, $deg(u)$ is a random
variable with binomial distribution $B(rn\ln n, \frac{k}{n})$. By the Chernoff bound, for any $u$ we
have $\Pr(deg(u) > (1 + \sqrt{6/(kr)})\,kr\ln n) < e^{-kr\ln n \cdot \frac{6}{kr} / 3} = \frac{1}{n^2}$. By the Union bound, we have
$\Pr(\exists u, deg(u) > (1 + \sqrt{6/(kr)})\,kr\ln n) < n \cdot \frac{1}{n^2} = \frac{1}{n}$, so
$\Pr(\forall u, deg(u) \le (1 + \sqrt{6/(kr)})\,kr\ln n) > 1 - \frac{1}{n}$, that is,
$\max_u deg(u) \le (1 + \sqrt{6/(kr)})\,kr\ln n$ whp. □

Our main observation is that there is a threshold for density parameter r in Model RB/RD, 
such that below this threshold, almost all instances are variable-centered consistent for 
some critical size, while above this threshold, almost all instances are not variable-centered
consistent for another critical size. Happily, the two critical sizes can be very close! 

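Lemma 3.3. Let $r_{bf} = -\frac{\alpha}{k\ln(1-p)}$, where $\alpha > 0$, $0 < p < 1$, $k = 2, 3, 4, \ldots$ are constants.
If $r < r_{bf}$, $0 < \epsilon < \min(\frac{r_{bf}-r}{r}, \frac{1}{2})$ and $t = (1+\epsilon)kr\ln n$, then
$\Pr(\forall u, u \text{ is vertex-centered } t\text{-consistent}) \ge 1 - e^{-n^{\Omega(1)}}$.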

Proof. Take $u$, $C_u$, $c$, $v$, $C$ and $c'$ as described in Section 2 and only consider $C_u$'s with
$|C_u| \le t$. Then

$$u \text{ is vertex-centered } t\text{-consistent} \iff \forall C_u, \forall c, \exists v, \forall C, \ c' \text{ is compatible with } C.$$

Under the distribution on random instances of Model RD, we have

$$\Pr(c' \text{ is compatible with } C) = 1 - p,$$

$$\Pr(\forall C, c' \text{ is compatible with } C) = (1-p)^{|C_u|},$$

$$\Pr(\exists C, c' \text{ is incompatible with } C) = 1 - (1-p)^{|C_u|},$$

$$\Pr(\forall v, \exists C, c' \text{ is incompatible with } C) = (1 - (1-p)^{|C_u|})^{d}.$$

To apply the Union bound on $u$, $C_u$ and $c$, we only need to upper bound $(1 - (1-p)^{|C_u|})^d$
and the number of choices of $u$, $C_u$ and $c$ respectively. To upper bound $(1 - (1-p)^{|C_u|})^d$,
recall that $\epsilon < \frac{r_{bf}-r}{r}$, and denote $\delta = r_{bf} - (1+\epsilon)r > 0$ and $\gamma = -\delta k\ln(1-p) > 0$. Then
$|C_u| \le t = (1+\epsilon)kr\ln n = (r_{bf} - \delta)k\ln n = (-\frac{\alpha}{\ln(1-p)} - \delta k)\ln n = -\frac{\alpha-\gamma}{\ln(1-p)}\ln n$, so we
have

$$(1 - (1-p)^{|C_u|})^d \le (1 - (1-p)^{-\frac{\alpha-\gamma}{\ln(1-p)}\ln n})^{n^{\alpha}} = (1 - n^{-\alpha+\gamma})^{n^{\alpha}} < (e^{-n^{-\alpha+\gamma}})^{n^{\alpha}} = e^{-n^{\gamma}} = e^{-n^{\Omega(1)}},$$

where the last inequality is by $1 - x < e^{-x}$ for $x \ne 0$. The number of possible
choices of $u$ is no greater than $n = e^{\ln n}$. By Lemma 3.2, for any $u$, the total number
of $u$-constraints is $deg(u) = O(\ln n)$ whp, so the number of possible choices of $C_u$
is no more than $2^{O(\ln n)} = e^{O(\ln n)}$ whp. For any $C_u$, the number of variables in $N_{C_u}$
is no more than $k|C_u|$, since each constraint includes exactly $k$ variables. Each variable
can take at most $d = n^{\alpha}$ different values, so the number of possible choices of $c$
is $|T_{C_{\setminus u}}| \le d^{|N_{C_u}|} \le d^{k|C_u|} \le (n^{\alpha})^{k|C_u|} \le n^{\alpha k t} = n^{O(\ln n)} = e^{O((\ln n)^2)}$. By the Union
bound, we have

$$\Pr(\exists u, \exists C_u, \exists c, \forall v, \exists C, c' \text{ is incompatible with } C) \le e^{\ln n} \cdot e^{O(\ln n)} \cdot e^{O((\ln n)^2)} \cdot e^{-n^{\Omega(1)}} = e^{-n^{\Omega(1)}}.$$

By taking the complement, we have $\Pr(\forall u, u \text{ is vertex-centered } t\text{-consistent}) =
\Pr(\forall u, \forall C_u, \forall c, \exists v, \forall C, c' \text{ is compatible with } C) \ge 1 - e^{-n^{\Omega(1)}}$. □

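Lemma 3.4. Let $r_{bf} = -\frac{\alpha}{k\ln(1-p)}$, where $\alpha > 0$, $0 < p < 1$, $k = 2, 3, 4, \ldots$ are constants. If
$r > r_{bf}$, $0 < \epsilon < \min(\frac{r-r_{bf}}{r}, \frac{1}{2})$, $\delta = (1-\epsilon)r - r_{bf} > 0$, $\gamma = -\delta k\ln(1-p) > 0$ and
$t = (1-\epsilon)kr\ln n$, then for all $u$ and for all $C_u$ with $|C_u| \ge t$,
$\Pr(\forall c, \exists v, \forall C, c' \text{ is compatible with } C) \le n^{-\gamma n^{\Omega(\ln n)}}$.
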
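Proof. As in the proof of Lemma 3.3, but only considering $C_u$'s with $|C_u| \ge t$,

$$\Pr(\forall v, \exists C, c' \text{ is incompatible with } C) = (1 - (1-p)^{|C_u|})^{d},$$

$$\Pr(\exists v, \forall C, c' \text{ is compatible with } C) = 1 - (1 - (1-p)^{|C_u|})^{d},$$

$$\Pr(\forall c, \exists v, \forall C, c' \text{ is compatible with } C) = (1 - (1 - (1-p)^{|C_u|})^{d})^{|T_{C_{\setminus u}}|}.$$

This time we only need to lower bound $(1 - (1-p)^{|C_u|})^d$ and $|T_{C_{\setminus u}}|$. To lower bound
$(1 - (1-p)^{|C_u|})^d$, recall that $\epsilon < \frac{r-r_{bf}}{r}$, $\delta = (1-\epsilon)r - r_{bf} > 0$ and $\gamma = -\delta k\ln(1-p) > 0$.
Then $|C_u| \ge t = (1-\epsilon)kr\ln n = (\delta + r_{bf})k\ln n = (\delta k - \frac{\alpha}{\ln(1-p)})\ln n = -\frac{\alpha+\gamma}{\ln(1-p)}\ln n$, so

$$(1 - (1-p)^{|C_u|})^d \ge (1 - (1-p)^{-\frac{\alpha+\gamma}{\ln(1-p)}\ln n})^{n^{\alpha}} = (1 - n^{-\alpha-\gamma})^{n^{\alpha}} \approx e^{-n^{-\gamma}},$$

where the last approximation is by $(1 - \frac{1}{x})^x \approx \frac{1}{e}$. To lower bound $|T_{C_{\setminus u}}|$, recall that
$C_{\setminus u}$ denotes the set of all constraints among variables in $N_{C_u} \setminus \{u\}$ and that $T_{C_{\setminus u}}$ consists of
the partial assignments compatible with all of them, so we only need to upper bound $|C_{\setminus u}|$
and to lower bound $|N_{C_u}|$.
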
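To upper bound $|C_{\setminus u}|$, we only need to upper bound $|N_{C_u}|$, since each constraint in $C_{\setminus u}$
is among variables in $N_{C_u} \setminus \{u\}$. In turn, we only need to upper bound $|C_u|$, since each
variable in $N_{C_u}$ is contained in some constraint in $C_u$ and each constraint contains exactly
$k$ variables. By Lemma 3.2, $|C_u| = O(\ln n)$ whp, so $|N_{C_u}| \le k|C_u| = O(\ln n)$ whp.
Since each constraint contains exactly $k$ variables, the probability that a given constraint
is among $N_{C_u} \setminus \{u\}$ is

$$\frac{\binom{|N_{C_u}|-1}{k}}{\binom{n}{k}} \le \frac{\binom{|N_{C_u}|}{k}}{\binom{n}{k}} \le \left(\frac{|N_{C_u}|}{n}\right)^{k} = \left(\frac{O(\ln n)}{n}\right)^{k}.$$

Since the total number of constraints is $rn\ln n = O(n\ln n)$, we have
$E(|C_{\setminus u}|) \le (\frac{O(\ln n)}{n})^k \cdot O(n\ln n) = O(\frac{(\ln n)^{k+1}}{n^{k-1}}) = o(1)$
for $k \ge 2$. By the Markov inequality, $\Pr(|C_{\setminus u}| \ge 1) \le E(|C_{\setminus u}|) = o(1)$, so $|C_{\setminus u}| = 0$ whp.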

To lower bound $|N_{C_u}|$, the number of variables involved in constraints in $C_u$, we only
need to upper bound the probability that a variable does not appear in any constraint in
$C_u$. Since each constraint includes exactly $k$ variables, a variable appears in a constraint
with probability $\frac{k}{n}$, does not appear in a constraint with probability $1 - \frac{k}{n}$, and does not
appear in any constraint in $C_u$ with probability $(1 - \frac{k}{n})^{|C_u|} \le (e^{-\frac{k}{n}})^{t} = e^{-\frac{kt}{n}} < 1 - \frac{kt}{n} + o(\frac{kt}{n})$, using
$1 - x < e^{-x} < 1 - x + o(x)$ for $x \ne 0$ and $|C_u| \ge t$. So $E(|N_{C_u}|) = n[1 - (1 - \frac{k}{n})^{|C_u|}] \ge n(\frac{kt}{n} -
o(\frac{kt}{n})) = kt - o(\ln n)$, since $t = O(\ln n)$. By the Chernoff bound, for a fixed constant
$0 < \eta < 1$ we have $\Pr(|N_{C_u}| < (1-\eta)kt) = o(1)$, so $|N_{C_u}| \ge (1-\eta)kt$ whp.

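Now we have

$$E(|T_{C_{\setminus u}}|) = (1-p)^{|C_{\setminus u}|}\, d^{|N_{C_u}|-1} \ge (1-p)^{0} \cdot d^{\Omega(\ln n)} = n^{\Omega(\ln n)} \text{ whp.}$$

By the second moment method, similar to that in [38], we can prove that $|T_{C_{\setminus u}}| \ge n^{\Omega(\ln n)}$
whp. So $\Pr(\forall c, \exists v, \forall C, c' \text{ is compatible with } C) = (1 - (1 - (1-p)^{|C_u|})^d)^{|T_{C_{\setminus u}}|} \le
(1 - e^{-n^{-\gamma}})^{n^{\Omega(\ln n)}} \le n^{-\gamma n^{\Omega(\ln n)}}$. □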



Finally, we can prove the exact phase transition of backtrack-freeness on Model RB/RD. 

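Theorem 3.5. Let $r_{bf} = -\frac{\alpha}{k\ln(1-p)}$, where $\alpha > 0$, $0 < p < 1$, $k = 2, 3, 4, \ldots$ are constants.
Then

$$\lim_{n\to\infty} \Pr(\phi \text{ is backtrack-free}) = \begin{cases} 1 & r < r_{bf}, \\ 0 & r > r_{bf}. \end{cases}$$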



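Proof. If $r < r_{bf}$, let $0 < \epsilon < \min(\frac{r_{bf}-r}{r}, \frac{1}{2})$. From Lemma 3.3, $\phi$ is vertex-centered
$(1+\epsilon)kr\ln n$-consistent whp. From Lemma 4.1, $width(\phi) \le (1+\epsilon)kr\ln n$ whp. By
definition, for $t' \le t$, vertex-centered $t$-consistency implies vertex-centered $t'$-consistency, so
$\phi$ is vertex-centered $width(\phi)$-consistent whp. By Theorem 3.1, $\phi$ is backtrack-free whp.
This completes the first half of our proof.

If $r > r_{bf}$, let $0 < \epsilon < \min(\frac{r-r_{bf}}{r}, \frac{1}{2})$. By Lemma 4.2, for any $\pi$, $width(\pi) \ge (1-\epsilon)kr\ln n$ whp,
so there exists a $u$ such that $|C^{\pi}_{u}| \ge (1-\epsilon)kr\ln n$. By Lemma 3.4, for any such $u$,

$$\Pr(\forall c \in T_{C^{\pi}_{\setminus u}}, \exists v, \forall C \in C^{\pi}_{u}, c' \text{ is compatible with } C) \le n^{-\gamma n^{\Omega(\ln n)}}.$$

Since the number of choices of $\pi$ is $n!$, by the Union bound,

$$\Pr(\phi \text{ is backtrack-free}) \le n! \Pr(\forall u, \forall c \in T_{C^{\pi}_{\setminus u}}, \exists v, \forall C \in C^{\pi}_{u}, c' \text{ is compatible with } C)$$

$$\le n! \Pr(\forall c \in T_{C^{\pi}_{\setminus u}}, \exists v, \forall C \in C^{\pi}_{u}, c' \text{ is compatible with } C)$$

$$\le n! \cdot n^{-\gamma n^{\Omega(\ln n)}} \approx \left(\frac{n}{e}\right)^{n} \cdot n^{-\gamma n^{\Omega(\ln n)}} = o(1).$$

This completes our proof. □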



4 Width of random hypergraphs 

In this section we determine the width of some random hypergraphs with a superlinear 
number of hyperedges. We apply a probabilistic method mainly inspired by [iHl [35] to 
detect the existence of $k$-cores. Denote by $HG$ a random hypergraph from $HG(n, rn\ln n, k)$.
We show that whp the width of $HG$, denoted as $width(HG)$, is asymptotically equal to the
average degree $kr\ln n$, due to the high concentration of the distribution of node degrees in $HG$.

Lemma 4.1. For any $0 < \epsilon < 1$, $width(HG) \le (1+\epsilon)kr\ln n$ whp.

Proof. The number of hyperedges in a subgraph $G' \subseteq HG$ is a random variable $X_{G'}$. If $G'$
has $f(n)$ nodes, then when adding a hyperedge to $HG$ with repetition, the value of $X_{G'}$ increases
by 1 with probability $\binom{f(n)}{k} / \binom{n}{k}$, so $X_{G'}$ is distributed as $B(rn\ln n, \binom{f(n)}{k} / \binom{n}{k})$, and

$$E(X_{G'}) = rn\ln n \cdot \frac{\binom{f(n)}{k}}{\binom{n}{k}} \le r\ln n \cdot f(n) < (1+\epsilon)\, r\ln n \cdot f(n). \qquad (1)$$



Let $avd(G')$ denote the average degree of $G'$. By (1) and the Chernoff bound, we have

$$\Pr(avd(G') \ge (1+\epsilon)kr\ln n) = \Pr(X_{G'} \ge (1+\epsilon)\, r\ln n \cdot f(n)) \le e^{-r\ln n \cdot f(n) \cdot \epsilon^2/3} = n^{-r\epsilon^2/3 \cdot f(n)}. \qquad (2)$$



Let the random variable $N_i = |\{G' \mid \text{subgraph } G' \text{ has } i \text{ nodes} \wedge avd(G') \ge (1+\epsilon)kr\ln n\}|$
and $N = N_1 + N_2 + \cdots + N_n$. Since the width of a hypergraph is equal to the maximal
linkage of its subgraphs [17], we have

$$\Pr(width(HG) \ge (1+\epsilon)kr\ln n) = \Pr(\exists G' \subseteq HG, linkage(G') \ge (1+\epsilon)kr\ln n)$$

$$\le \Pr(\exists G' \subseteq HG, avd(G') \ge (1+\epsilon)kr\ln n) \le \Pr(N_1 + N_2 + \cdots + N_n \ge 1) \le E(N). \qquad (3)$$

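Below we show that $E(N)$ tends to 0 by showing that $E(N_{f(n)}) = o(1/n)$.
Case 1. When $f(n)$ is large, namely $n^{1-r\epsilon^2/3} \ll f(n) \le n$, since by (2), we have

$$E(N_{f(n)}) \le \binom{n}{f(n)} \cdot n^{-r\epsilon^2/3 \cdot f(n)} \le \left(\frac{en}{f(n)}\right)^{f(n)} \cdot n^{-r\epsilon^2/3 \cdot f(n)} = \left(\frac{e\, n^{1-r\epsilon^2/3}}{f(n)}\right)^{f(n)} = o(1/n).$$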
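Case 2. When $f(n)$ is small, that is $f(n) \ll n$, since by (1), for all $i \ge (1+\epsilon)\, r\ln n \cdot f(n)$,
we have $\Pr(X_{G'} = i) \le \Pr(X_{G'} = (1+\epsilon)\, r\ln n \cdot f(n))$, so

$$\Pr(avd(G') \ge (1+\epsilon)kr\ln n) \le n \Pr(avd(G') = (1+\epsilon)kr\ln n) = n \Pr(X_{G'} = (1+\epsilon)\, r\ln n \cdot f(n))$$

$$\le n \binom{rn\ln n}{(1+\epsilon)\, r\ln n \cdot f(n)} \left(\frac{\binom{f(n)}{k}}{\binom{n}{k}}\right)^{(1+\epsilon) r\ln n \cdot f(n)} \le n \binom{rn\ln n}{(1+\epsilon)\, r\ln n \cdot f(n)} \left(\frac{f(n)}{n}\right)^{k(1+\epsilon) r\ln n \cdot f(n)}$$

$$\le n \left(\frac{e\, rn\ln n}{(1+\epsilon)\, r\ln n \cdot f(n)}\right)^{(1+\epsilon) r\ln n \cdot f(n)} \left(\frac{f(n)}{n}\right)^{k(1+\epsilon) r\ln n \cdot f(n)} = n \left(C_1 \cdot \frac{f(n)}{n}\right)^{C_2 f(n)\ln n},$$

where $C_1 > 0$ and $C_2 > 0$ are two constants. Then,

$$E(N_{f(n)}) \le \binom{n}{f(n)} \cdot n \left(C_1 \cdot \frac{f(n)}{n}\right)^{C_2 f(n)\ln n} \le \left(\frac{en}{f(n)}\right)^{f(n)} n \left(C_1 \cdot \frac{f(n)}{n}\right)^{C_2 f(n)\ln n} \le \left(C_1' \cdot \frac{f(n)}{n}\right)^{C_2' f(n)\ln n} = o(1/n),$$

where $C_1' > 0$ and $C_2' > 0$ are two constants.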

The above two cases overlap each other, so we can upper bound $E(N)$ as

$$E(N) \le \sum_{f(n) \ll n} E(N_{f(n)}) + \sum_{f(n) \gg n^{1-r\epsilon^2/3}} E(N_{f(n)}) \le 2n \cdot o(1/n) = o(1). \qquad (4)$$

The lemma follows from (3) and (4). □

Lemma 4.2. For any $0 < \epsilon < 1$, $width(HG) \ge (1-\epsilon)kr\ln n$ whp.



Proof. Let $m = (1-\epsilon)kr\ln n$. Since the width of a hypergraph is equal to the maximal
linkage of its subgraphs [17], we need to prove the existence of a subgraph of $HG$ whose
minimum degree is at least $m$ whp, that is, the existence of an $m$-core whp. This can be
achieved by an analysis of the following standard $m$-core detecting algorithm: while there
exists any node with degree less than $m$, randomly select such a node and delete it together
with all hyperedges containing it; if there is no node left then output No, otherwise output
the remaining subgraph.
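
The $m$-core detecting algorithm described above can be sketched as follows. This is an illustrative sketch under our own naming (`find_m_core` is hypothetical). It picks low-degree nodes in an arbitrary order rather than at random, which does not change the surviving node set, and it returns the empty set where the description above outputs No.

```python
def find_m_core(n, hyperedges, m):
    """Peel nodes of degree < m; return the surviving node set (empty if no m-core)."""
    edges = [set(e) for e in hyperedges]
    incident = {u: set() for u in range(n)}      # node -> indices of live hyperedges
    for idx, e in enumerate(edges):
        for u in e:
            incident[u].add(idx)
    remaining = set(range(n))
    low = [u for u in remaining if len(incident[u]) < m]
    while low:
        u = low.pop()
        if u not in remaining:                   # already deleted via a duplicate entry
            continue
        # Delete u together with all hyperedges containing it.
        for idx in list(incident[u]):
            for w in edges[idx]:
                incident[w].discard(idx)
                if w != u and w in remaining and len(incident[w]) < m:
                    low.append(w)                # w may have dropped below degree m
        remaining.remove(u)
    return remaining
```

For a complete graph on four nodes with one extra pendant node, the 3-core is the complete graph itself and no 4-core exists.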

Let $X_i$ denote the number of nodes whose degrees are less than $m$ after deleting the $i$th
node. Let $W_{i,j} = \{u \mid u \text{ has degree } j \text{ after deleting the } i\text{th node}\}$; then
$X_i = |W_{i,1}| + |W_{i,2}| + \cdots + |W_{i,m-1}|$. Obviously, an $m$-core exists if and only if the node-hyperedge
deletion process cannot delete all nodes, that is, if and only if there exists a $j < n$ such that $X_j = 0$. Since

$$\Pr(width(HG) \ge m) = \Pr(\exists j < n, X_j = 0) \ge \Pr(X_0 + |W_{0,m}| \le n^{\delta} \wedge \exists j < n, X_j = 0)$$

$$= \Pr(X_0 + |W_{0,m}| \le n^{\delta}) \cdot \Pr(\exists j < n, X_j = 0 \mid X_0 + |W_{0,m}| \le n^{\delta}), \qquad (5)$$

where $\delta \in (0, 1)$ will be determined later, we only need to estimate the last two probabilities.

Whenever we add a hyperedge to $HG$ with repetition, a node's degree increases by 1 with a
probability of $k/n$. So the degree of each node in $HG$ is a random variable with distribution
$B(rn\ln n, k/n)$. By the Chernoff bound, for a specific node $u$, we have

$$\Pr(u\text{'s degree is not more than } m) \le n^{-kr\epsilon^2/2}.$$

So $E(X_0 + |W_{0,m}|) \le n \cdot n^{-kr\epsilon^2/2} = n^{1-kr\epsilon^2/2}$. By the Markov inequality, we have
$\Pr(X_0 + |W_{0,m}| \ge n^{\delta}) \le E(X_0 + |W_{0,m}|)/n^{\delta} \le n^{1-kr\epsilon^2/2-\delta}$, so for $\delta \in (1 - kr\epsilon^2/2, 1)$, we have

$$\Pr(X_0 + |W_{0,m}| \le n^{\delta}) = 1 - \Pr(X_0 + |W_{0,m}| > n^{\delta}) \ge 1 - o(1). \qquad (6)$$

Now assume that $X_0 + |W_{0,m}| \le n^{\delta}$, where $1 - kr\epsilon^2/2 < \delta < 1$. When deleting the
$(i+1)$th node, at most $(m-1)$ hyperedges are deleted together, which contain at most
$(m-1)(k-1)$ other nodes, among which only the $m$-degree nodes will count for $X_{i+1}$. Since
any subhypergraph with a given degree sequence is uniformly random, see for example [29],
such a subhypergraph can be generated according to the configuration model [29], so the
probability that one deleted hyperedge contains an $m$-degree node is

$$q_i = m|W_{i,m}| \Big/ \sum_{j} j|W_{i,j}|.$$

Let $T_i$ be a random variable with distribution $B((m-1)(k-1), q_i)$; then the sequence of
random variables $X_0, X_1, \ldots$ can be described as

$$X_0 \le n^{\delta} \text{ and } X_{i+1} \le X_i - 1 + T_i.$$

Since $|W_{0,m}| \le X_0 + |W_{0,m}| \le n^{\delta}$ and $\sum_{j \ge 1} j|W_{0,j}| = krn\ln n$, we have

$$(m-1)(k-1)q_0 \le ((1-\epsilon)kr\ln n - 1)(k-1) \cdot \frac{m n^{\delta}}{krn\ln n} = o(1).$$

After deleting the $i$th node, compared with the beginning of the node-hyperedge deletion
process, the number of $m$-degree nodes increases by at most $(m-1)(k-1)i$ and the sum
$\sum_{j \ge 1} j|W_{i,j}|$ decreases by at most $(m-1)i$. So for all $i \le n^{\delta'}$, where $\delta' \in (\delta, 1)$, we have

$$(m-1)(k-1)q_i \le \frac{(m-1)(k-1)m(|W_{0,m}| + (m-1)(k-1)n^{\delta'})}{krn\ln n - (m-1)n^{\delta'}}$$

$$\le \frac{(m-1)(k-1)m(n^{\delta} + (m-1)(k-1)n^{\delta'})}{krn\ln n - (m-1)n^{\delta'}} = o(1).$$

Thus, $E(T_i) = (m-1)(k-1)q_i$ can be arbitrarily small. Without loss of generality, let $q$
be determined by $(m-1)(k-1)q = 1/2$. Let $D_i$ be a random variable with distribution
$B((m-1)(k-1), q)$. We now define a new sequence of random variables $Y_0, Y_1, \ldots$ by

$$Y_0 = n^{\delta} \text{ and } Y_{i+1} = Y_i - 1 + D_i.$$

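Clearly, for all $i \le n^{\delta'}$, $X_i$ is statistically dominated by $Y_i$, and $\sum_{i=1}^{n^{\delta'}} D_i$ is distributed as
$B(n^{\delta'}(m-1)(k-1), q)$. Therefore,

$$\Pr(\exists j < n, X_j = 0 \mid X_0 + |W_{0,m}| \le n^{\delta}) \ge \Pr(\exists j < n, X_j = 0 \mid X_0 \le n^{\delta})$$

$$\ge \Pr(\exists j \le n^{\delta'}, Y_j = 0 \mid Y_0 = n^{\delta}) \ge \Pr(Y_{n^{\delta'}} \le 0) = \Pr\Big(\sum_{i=1}^{n^{\delta'}} D_i \le n^{\delta'} - n^{\delta}\Big)$$

$$= 1 - \Pr\Big(\sum_{i=1}^{n^{\delta'}} D_i > n^{\delta'} - n^{\delta}\Big) \ge 1 - \Pr\Big(\sum_{i=1}^{n^{\delta'}} D_i > \tfrac{2}{3} n^{\delta'}\Big) \ge 1 - \exp\Big(-\frac{n^{\delta'}}{54}\Big) = 1 - o(1), \qquad (7)$$

where the second-to-last step is by the Chernoff bound. The lemma follows from (5), (6) and
(7). □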
5 Discussions



We have proved that in some random CSP models (Model RB/RD), the backtrack-freeness
threshold $r_{bf}$ in Theorem 3.5 not only exists, but also has a fixed ratio to the satisfiability
threshold $r_{cr}$ in Theorem 2.1, that is, $r_{bf} = \frac{r_{cr}}{k}$, where $k$ is the number of variables in each
constraint.
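
To make the relation between the two thresholds concrete, the following sketch evaluates $r_{cr} = -\alpha/\ln(1-p)$ and $r_{bf} = r_{cr}/k$ and classifies a given density $r$. The function names and the example parameter values are our own. The classification is asymptotic (whp statements), and the satisfiability part additionally relies on the conditions on $\alpha$, $p$ and $k$ required by the satisfiability threshold theorem.

```python
import math


def r_cr(alpha, p):
    """Satisfiability threshold r_cr = -alpha / ln(1 - p)."""
    return -alpha / math.log(1 - p)


def r_bf(alpha, p, k):
    """Backtrack-freeness threshold r_bf = r_cr / k."""
    return r_cr(alpha, p) / k


def regime(r, alpha, p, k):
    """Asymptotic regime of density r (behaviour exactly at a threshold is not covered)."""
    if r < r_bf(alpha, p, k):
        return "backtrack-free whp"
    if r < r_cr(alpha, p):
        return "satisfiable whp, not backtrack-free whp"
    return "unsatisfiable whp"
```

For instance, with $\alpha = 0.8$, $p = 0.25$ and $k = 2$, the satisfiability threshold is about 2.78 and the backtrack-freeness threshold is half of it.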

The first implication is on the power of greedy algorithms. A CSP algorithm is called
greedy if at each step we choose an unassigned variable by some rule and assign an available
value to it; here by availability we mean that the extended partial assignment is compatible
with all constraints among all assigned variables. Availability is a natural feature of
common greedy algorithms. A greedy algorithm succeeds on an instance if all variables can
be assigned in this way, and fails otherwise. To specify a greedy algorithm, we need to specify
the rule to choose the next variable from the unassigned variables and the rule to choose an
available value for the variable. In turn, every greedy algorithm specifies a linear ordering,
called the induced ordering, on all variables in an instance, and the width of the induced ordering
on the constraint graph can be called the width of the greedy algorithm on this instance. Note
that some greedy algorithms have a fixed linear ordering not depending on instances and thus a
fixed width. For others, we can define the width of the greedy algorithm as the maximum
width over all instances.
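
The width of a given induced ordering is straightforward to compute, because each constraint contributes to the width of exactly one variable, namely the last variable of its scope under the ordering. The helper below is a sketch under our own naming (`ordering_width` is hypothetical) and takes the constraint scopes as tuples of variables.

```python
def ordering_width(order, scopes):
    """Width of a linear ordering: the maximum, over variables u, of the number
    of constraints whose scope has u as its last variable under `order`."""
    position = {u: i for i, u in enumerate(order)}
    width_of = {u: 0 for u in order}
    for scope in scopes:
        last = max(scope, key=position.__getitem__)   # latest variable in the scope
        width_of[last] += 1
    return max(width_of.values(), default=0)
```

On a star with center 0 and three leaves, a greedy algorithm that assigns the center first has width 1 on this instance, whereas one that assigns the center last has width 3.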

If an instance is backtrack-free under an ordering $\pi$, then every greedy algorithm as described
above with induced ordering $\pi$ will succeed on this instance, no matter how an
available value is chosen for each variable. Moreover, if an instance is vertex-centered
$t$-consistent, then every greedy algorithm as described above with induced width no greater
than $t$ will succeed on this instance, no matter how an available value is chosen for each
variable. As far as we know, this is the first time that the width of a greedy
CSP algorithm has been defined explicitly and related to the power of greedy algorithms on CSPs.

As a concrete example of the above discussion, let us consider Model RB/RD. On the one
hand, Model RB/RD is NP-complete for all positive values of the density parameter $r$.
On the other hand, at least in a constant portion of the satisfiable range of values for
parameter $r$ (that is, $r < r_{bf} = \frac{r_{cr}}{k}$), there is an easily determined ordering of variables such
that every greedy algorithm following that ordering will succeed on almost all
instances of Model RB/RD, in sharp contrast to its worst-case complexity. When $k = 2$, in at
least half of the satisfiable range of values for parameter $r$ (that is, $r < r_{bf} = \frac{r_{cr}}{2}$),
almost all instances can be easily solved by greedy algorithms. For instances above
$r_{bf}$, however, with high probability there does not exist such an ordering that guarantees the
success of every greedy algorithm. This implies that the exact threshold of backtrack-freeness
obtained in this paper can also be viewed as a threshold for the power of greedy algorithms.

The second implication is about the satisfiability threshold for random CSPs. For Model
RB/RD, the exact threshold of satisfiability is $r_{cr} = -\frac{\alpha}{\ln(1-p)}$ (Theorem 1 in [48]), which
is independent of $k$, the number of variables in each constraint, while the exact threshold
of backtrack-freeness is $r_{bf} = -\frac{\alpha}{k\ln(1-p)} = \frac{r_{cr}}{k}$, which decreases with $k$. For fixed $k$, these
two thresholds have a fixed ratio $k$, so an exact link between them exists. Note that the
backtrack-freeness threshold also coincides with the threshold of vertex-centered consistency,
a local property. So our results provide evidence that for random CSPs, the exact threshold
of satisfiability might have links to thresholds of some local properties, say local consistency.
Based on this evidence, we propose the following two steps to attack the notorious problem
of determining the satisfiability threshold for random 3-SAT.

• Step 1: reduce the satisfiability threshold to some local property (say local consis- 
tency) threshold. 

• Step 2: determine the local property threshold. 

Since reductions are commonly used in computer science and local properties are usually 
easier to handle than global properties, hopefully the two steps each will be easier than 
directly attacking the original satisfiability threshold problem. 